GeoTime and nSpace2

Oculus Info Inc.

VAST 2008 Challenge

Mini Challenge 1: Wiki Editors

Authors and Affiliations:

Adeel Khamisa, Oculus Info, akhamisa@oculusinfo.com
Greg Wiseman, Oculus Info, gwiseman@oculusinfo.com
Rob Harper, Oculus Info, rharper@oculusinfo.com

Student Team:

No

Tool(s):

GeoTime Configurable Spaces: The main tool we used to solve this challenge was GeoTime v2.6, released in 2008 by Oculus Info. GeoTime supports the visualization and analysis of entities and events over time and geography. Events are represented within an X,Y,T coordinate space, in which the X,Y plane shows geographic space and the vertical axis represents time. Entity movements, events, relationships, and interactions over time within a spatial context can be easily seen and understood. Events animate through this 3-D space as time is played through. For analysts, GeoTime's combined spatiotemporal display amplifies the concurrent cognition of entity relationships and behaviors in space and time. Analysts can see the who and what in the where and when. GeoTime includes keyword search, link analysis, imagery display, geometry display, annotation and numerous other analytical functions. [Kapler, T., R. Eccles, R. Harper, W. Wright, Configurable Spaces: Temporal Analysis in Diagrammatic Contexts, accepted for IEEE VAST 2008 Conference. See also Kapler, Thomas, and William Wright, 2004. GeoTime Information Visualization, IEEE Symposium on Information Visualization.]

Excel Visualizer: Oculus Excel Visualizer is a Microsoft Excel® extension designed to give users immediate understanding of the data that drives their business intelligence. By leveraging the ubiquity, power and ease of use of Excel spreadsheets, Oculus has created a new paradigm in rapid data visualization. Users can take advantage of our integrated charting capabilities to obtain new views of their data for further insight and comprehension.

nSpace2: nSpace2 is the web version of nSpace, an environment supporting the whole analytical workflow from brainstorming and creating hypotheses, to querying, scanning, comparing, reading and annotating, evidence marshaling and reasoning, to evidence assessment, collaboration and reporting. It has two main components: TRIST, focused on information triage, and the Sandbox, for evidence marshaling and analytical sense-making. nSpace2 is still in beta and currently has only a subset of the capabilities of its parent nSpace. Nevertheless, its core initial capabilities, and in particular its strength in supporting multiple analysts working on related projects, were key to this team’s analytical process. [See Wright, W., Schroh, D., Proulx, P., Skaburskis, A., and Cort, B. The Sandbox for Analysis – Concepts and Methods, paper accepted for ACM CHI 2006.]

Two Page Summary: NO

ANSWERS:


Video

Mini Challenge #1 Wiki Video

Wiki-1: What are the factions represented in the edit pages and who are its members? In other words, describe the groups and their members based on their editing changes.

Detailed Answer:

Groups  
Pro Paraiso

VictoriaV; RyogaNica; Amado; Savanna;

Con Paraiso

Extreme Refuters:

82.152.249.x; 66.66.125.x; Alejo; Rm99;

Vandals:

75.179.21.x; 201.226.51.x; 71.59.210.x; 204.52.215.x; 68.60.74.x; 131.174.244.x; 74.120.3.x; 69.14.85.x; 24.168.142.x; Cristofer; 86.151.194.x; Alphanzo; 75.81.8.x; 128.125.81.x; Absalon; 195.113.65.x; Molotover; 67.55.3.x; 84.158.202.x; 209.155.27.x; Alejandrosanchez; 66.175.135.x;

Skeptics of Paraiso

Agustin; DailosTamanca;

Neutral Editors

Sara; BakBot; Seina; Soccoro; Estirabot; Sarita; Edemir; Kurrop; Ricarda;

To organize the data in a structured format, the unstructured text file containing the Wikipedia edit logs was imported into Excel using a parser. Data was sorted into labeled columns: editor’s name; date/time of edit; description of edit; and file size. The parser also calculated the change in file size after each edit and stored it in an additional column. To discover any numerical patterns, this data was visualized in Oculus Excel Visualizer, where we observed two patterns in the editing of the Paraiso Wikipedia page: deletion and restoration of the record (Figure 1a) and subsequent offsetting edits, or editing wars (Figure 1b).
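The report does not describe the parser itself, so the following is only a minimal sketch of how the import step could look. It assumes the log lines follow the format of the edit records quoted later in this answer; the names LINE_RE and parse_log are hypothetical.

```python
import re
from datetime import datetime

# Assumed line format, modelled on the edit records quoted in this report:
# "# (cur) (last) 09:52, 4 September 2006 Editor (Talk | contribs) (93,491 bytes) (description)"
LINE_RE = re.compile(
    r"(?P<time>\d{2}:\d{2}), (?P<date>\d{1,2} \w+ \d{4}) "
    r"(?P<editor>\S+) \(Talk \| contribs\)\s*(?:m )?"
    r"(?:\((?P<size>[\d,]+) bytes\) )?\((?P<desc>.*)\)\s*$"
)

def parse_log(lines):
    """Return one dict per edit, in chronological order, with a size delta."""
    rows = []
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # skip lines that do not look like edit records
        size = m.group("size")
        rows.append({
            "editor": m.group("editor"),
            "when": datetime.strptime(
                f"{m.group('date')} {m.group('time')}", "%d %B %Y %H:%M"
            ),
            "description": m.group("desc"),
            # Page moves and similar records carry no byte count.
            "size": int(size.replace(",", "")) if size else None,
        })
    rows.sort(key=lambda r: r["when"])

    previous_size = None
    for row in rows:
        if row["size"] is not None and previous_size is not None:
            row["delta"] = row["size"] - previous_size
        else:
            row["delta"] = None  # no earlier size to compare against
        if row["size"] is not None:
            previous_size = row["size"]
    return rows
```

The delta column is the quantity plotted in Figures 1a and 1b; records without a byte count are kept but left out of the size comparison.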

Figure 1a: File changes over 20 000 bytes stand out showing a deletion and restoration of the entire record.

Figure 1b: A close view of File Changes under 20 000 bytes reveals patterns indicating editing wars; subsequent offsetting changes in file size by alternating editors.

After reading the descriptions of the edits, we found that the first pattern indicated instances of vandalism that were corrected by bots. The second pattern, subsequent offsetting changes in file size, is indicative of editing wars. Both patterns expose differences of opinion: an editor either supported another editor’s claims or refuted them. The relationships between editors were recorded in a spreadsheet in two columns (Refutes, Supports) based on two scenarios:

  1. If A opposes G, then A refutes G.
  2. If P restores E’s prior version, then P supports E.

Editor    Refutes    Supports
A         G
P                    E

Table 1: Example of structure extracted from wiki edits page to create visualizations shown in Figures 2 and 3
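In this analysis the relationships were recorded after reading the edit descriptions. Purely as an illustration of the two scenarios, the sketch below approximates them from file sizes alone: it assumes that an edit returning the page to the exact size it had two edits earlier is a revert. That size-equality test and the name infer_relations are assumptions, not part of the original method.

```python
def infer_relations(edits):
    """edits: chronological list of (editor, size_in_bytes) tuples.

    Returns (editor, relation, other_editor) rows in the shape of Table 1.
    Heuristic (assumption): if edit i brings the page back to the size it
    had after edit i-2, it is treated as undoing edit i-1.
    """
    relations = []
    for i in range(2, len(edits)):
        restored_editor, restored_size = edits[i - 2]
        undone_editor, undone_size = edits[i - 1]
        editor, size = edits[i]
        if size != restored_size:
            continue  # size not restored, so not treated as a revert
        if undone_size == restored_size or editor == undone_editor:
            continue  # nothing was undone, or the editor reverted themselves
        relations.append((editor, "Refutes", undone_editor))
        if restored_editor != editor:
            relations.append((editor, "Supports", restored_editor))
    return relations
```

For example, the sequence E, G, P with sizes 1000, 1200, 1000 yields "P Refutes G" and "P Supports E". Size equality will miss reverts combined with other changes and can produce false matches, so the output would still need checking against the edit descriptions.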

 

In order to view factions between editors who supported and refuted each other, the above spreadsheet was imported into GeoTime Configurable Spaces. Two social network diagrams were created: Wiki Refutes (Figure 2) and Wiki Supports (Figure 3).

In GeoTime Configurable Spaces we used the CircularLayouter API algorithm to create the layout. The algorithm generates a circular tree diagram where parent nodes are encircled by children nodes.
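The internals of the CircularLayouter algorithm are not covered here. The following sketch only illustrates the general idea of a parent node encircled by its children, using evenly spaced angles; the function name circular_layout and its parameters are hypothetical and do not reflect the GeoTime API.

```python
import math

def circular_layout(parent, children, radius=1.0, center=(0.0, 0.0)):
    """Place the parent at the center and its children evenly on a circle."""
    positions = {parent: center}
    count = len(children)
    for k, child in enumerate(children):
        angle = 2 * math.pi * k / count  # equal angular spacing
        positions[child] = (
            center[0] + radius * math.cos(angle),
            center[1] + radius * math.sin(angle),
        )
    return positions
```

A full circular tree layout would apply this recursively, with each child acting as the center of a smaller circle for its own children.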

Figure 2: Wiki Refutes. Editors who disagree with each other are connected with red lines. Notice that DailosTamanca, Agustin and Rm99 all disagree with VictoriaV quite often.

Figure 3: Wiki Supports. Shows editors who agree. Notice that Savanna and VictoriaV share a strong agreement. Rm99 and DailosTamanca agree as well, but not as often, which suggests a slight difference in opinion.

Although these two diagrams are useful for focusing on disagreement between individuals, another diagram outlines members and their factions.

The Wiki Edits GeoTime Configurable Spaces (GT Config Spaces) Graph: Defining the factions and temporal patterns

Figure 4: Shows the evolution of factions over time. The ground plane shows members' final positions within these factions. The answer provided below describes how this graph was created.

Reading the wiki edits discussion page revealed some of the points of division that could exist between members. The largest was the legitimacy of Paraiso as a positive religion. However, a metric was needed to measure the Pro and Con state so that it could in turn be visualized, as shown in Figure 4.

Pro Paraiso and Con Paraiso in time

The Wiki Edits GT Config Spaces graph visualizes behavioral patterns and groups them into factions by measuring net claims on the X axis (net claims = net unchallenged edits (supports) – net challenged edits (refutes)) and sentiment towards the Paraiso movement (+Y = For Paraiso, -Y = Against Paraiso, Y=0 Neutral/Indifferent) on the Y axis. All measurements take place in time which is measured on the Z axis.

All editors start at point (0,0) on the GT Config Spaces graph and drift through the 3D editor sentiment space based on their editing comments. Viewing these two variables across time, we can see factions develop. The measurement is performed on those who refute and/or support each other.

Sentiment towards Paraiso: The Y axis in the ground plane

The iterative process added one measurement to determine who was Pro or Con Paraiso. This was determined by manually reading the filtered record. A negative Y value indicates that an editor is Con Paraiso, and a positive Y value indicates that an editor is Pro Paraiso.

Net Claims: The X axis in the ground plane

To distinguish between members in sub-factions we use the net claims measure. Net claims reveal how accepted a user’s views on Paraiso are. The Refutes column created for Wiki Refutes already measured challenged edits, and the Supports column already measured unchallenged edits. From these we derive the net claims value plotted on the X axis. A large negative X value indicates extreme bias, whereas a positive X value indicates acceptance.

Time: The vertical Z axis

Temporal patterns were observed to distinguish behavioral characteristics and sub-factions. This is clearly seen in Figure 7 and Figure 8, where certain attitudes exist only for short periods of time and others change.

For example, VictoriaV is evaluated as being Pro Paraiso, whereas Rm99 is adamantly Con Paraiso.

Figure 5a: VictoriaV challenges a negative comment by Rm99 towards the Paraiso movement and shows her support for Paraiso. VictoriaV moves right one step on the X axis for her correcting statement, and up one step on the Y axis for her positive sentiment. Rm99 moves in the opposite direction on both axes.

Figure 5b: Rm99 challenges VictoriaV’s challenge with a negative statement towards the Paraiso movement. Rm99 moves positively on the X axis, undoing the effect of VictoriaV’s previous correction, but moves further down the Y axis for his negative sentiment. VictoriaV is challenged, so she moves left from her previous position on the X axis and further up the Y axis for her Pro Paraiso stance.

Figure 5c: Over time refuting events repel VictoriaV and Rm99 in opposing directions. It becomes quite clear that VictoriaV is Pro Paraiso and Rm99 is Con Paraiso. A similar measurement is made with all editors.
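The movement rule illustrated in Figures 5a to 5c can be sketched as follows. This is a simplified reading, not the actual scoring procedure: it assumes one unit step per event, that a refuting edit moves the challenger +1 and the challenged editor -1 on X while each moves along Y by their own sentiment, and that a supporting edit moves only the supported editor +1 on X. The name trace_positions and the event tuple format are hypothetical.

```python
from collections import defaultdict

def trace_positions(events, sentiment):
    """Replay events and return a snapshot of every editor's (x, y) after each.

    events:    chronological (kind, actor, target) tuples,
               where kind is "refutes" or "supports".
    sentiment: editor -> +1 (Pro Paraiso), -1 (Con Paraiso) or 0 (neutral),
               as judged by manually reading the record.
    """
    positions = defaultdict(lambda: (0, 0))  # every editor starts at (0, 0)
    history = []
    for kind, actor, target in events:
        if kind == "refutes":
            moves = ((actor, +1), (target, -1))
        else:  # "supports": only the supported editor gains net claims
            moves = ((target, +1),)
        for editor, dx in moves:
            x, y = positions[editor]
            dy = sentiment.get(editor, 0) if kind == "refutes" else 0
            positions[editor] = (x + dx, y + dy)
        history.append(dict(positions))
    return history
```

Replaying the two events of Figures 5a and 5b under these assumptions puts VictoriaV at (1, 1) and Rm99 at (-1, -1) after the first challenge, and at (0, 2) and (0, -2) after the second, so the two editors separate along the Y axis as in Figure 5c. Plotting each snapshot against its timestamp gives the vertical time axis.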

Neutral Editors and Vandals
Figure 6: Over time, BakBot’s corrections move it along the X axis. All neutral editors show strong movements along the X axis. Whenever an editor restores the record to one of BakBot’s changes, BakBot also moves along the X axis for being supported. Users who delete the record seem to do so only once and are corrected by BakBot only once. These editors (vandals) tend to group at the point (-1,-1) on the graph.

Figure 6: The neutral corrections of BakBot help identify a faction within Con Paraiso: Vandals. Neutral editors move for long stretches along the X axis, while vandals cluster around the point (-1,-1).

Figure 7: Extreme Refuters of Paraiso inhabit the (-X, -Y) region. Unlike the Vandals, they make more than one edit; however, most of them do not edit for more than a few days in September.

Figure 8: Skeptics of Paraiso inhabit the (+X,-Y) region of the graph. The positive X movement suggests that their views are accepted. This differentiates them from the Extreme Refuters who are deeply biased and have negative X movement.

Extreme Refuters
Figure 7: Another faction within the Con Paraiso movement is the Extreme Refuters. They are often corrected for their extreme bias against Paraiso and end up in the (-X,-Y) portion of the chart. This indicates that their views are negative towards Paraiso and are viewed as unreasonable by the Wikipedia editing community.

Skeptics of Paraiso
Within the Con Paraiso movement there are individuals who end up in the (+X, -Y) region. Although they speak against Paraiso, their criticism seems to be accepted, and as a result their claims remain largely unchallenged. Figure 8 shows this relationship. DailosTamanca’s stretches of neutral edits move him in the positive direction on the X axis. Agustin displays the same trend; however, it is not as pronounced.
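The regions described above can be summarized as a simple lookup on an editor's final ground-plane position. This is a sketch of the stated rules only: in practice the factions were read from the graph, and the treatment of the X = 0 boundary here is an assumption. The function name classify is hypothetical.

```python
def classify(x, y):
    """Map a final (net claims, sentiment) position to a faction label."""
    if y > 0:
        return "Pro Paraiso"
    if y == 0:
        return "Neutral Editor"
    # Everything below is Con Paraiso (negative sentiment).
    if (x, y) == (-1, -1):
        return "Vandal"  # single deletion, corrected once
    if x < 0:
        return "Extreme Refuter"  # views repeatedly challenged
    if x > 0:
        return "Skeptic of Paraiso"  # criticism largely accepted
    return "Con Paraiso (undetermined)"  # assumption: X = 0 is left unresolved
```

Note that a single point cannot separate a Vandal from an Extreme Refuter who happens to finish at (-1,-1); the temporal pattern (one edit versus several) is what distinguishes them in Figures 6 and 7.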

Wiki-2: Is the Paraiso movement involved in violent activities?

Yes

List of wiki edits providing evidence

# (cur) (last) 09:52, 4 September 2006 Barfly2001 (Talk | contribs) (93,491 bytes) (→See also - {{wikinews|Belgian justice prosecutes Paraiso}})

# (cur) (last) 09:26, 4 September 2006 Angelgasperi (Talk | contribs) (93,439 bytes) (→Controversy and criticism - Belgium prosecuting, wikinews source)

# (cur) (last) 03:16, 19 September 2006 Alphanzo (Talk | contribs) m (moved Paraiso to GUNNED DOWN SIX DOCTORS AND NURSES IN COLD BLOOD)

# (cur) (last) 11:58, 18 November 2006 Amado (Talk | contribs) (114,196 bytes) (Deleted false statement. A person can only be declared afther it has been proven in a Justicia Juicio that he commited a high crime. Intro to C. Ethics 1998 is no longer used refer to 2006 edition only)

Short Answer (150 words max):

After filtering the record for editing wars, the term “controversy and criticism” kept reappearing along with country names and obscenities. A parser was used to run a keyword search on violent terms, creating a column called Potential Violence. If a keyword appeared in the description field, the associated Potential Violence column was marked TRUE.
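A keyword flag of this kind might look like the sketch below. The term list actually used is not given in this answer, so VIOLENT_TERMS is a hypothetical example, as is the function name flag_potential_violence; rows are assumed to be dicts with a "description" field.

```python
import re

# Hypothetical term list; the terms actually searched are not listed here.
VIOLENT_TERMS = ("gunned", "murder", "kill", "bomb", "attack", "prosecut")

def flag_potential_violence(rows, terms=VIOLENT_TERMS):
    """Add a boolean "Potential Violence" column to each edit record."""
    pattern = re.compile(
        "|".join(re.escape(term) for term in terms), re.IGNORECASE
    )
    for row in rows:
        # TRUE when any term occurs anywhere in the edit description.
        row["Potential Violence"] = bool(pattern.search(row["description"]))
    return rows
```

Substring matching is deliberately broad (for example, "prosecut" covers "prosecutes" and "prosecuting"), so flagged records still need to be read and filtered for relevance.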

These records were sorted in nSpace2’s Sandbox, where edits were grouped and filtered for relevancy.

 

Figure 1: Relevant facts are sorted in groups.

From here facts were weighed in an assertion (Figure 2) in order to see if there were enough facts to determine potential violence.

 

Figure 2: Weighing facts in an assertion box shows that there is evidence to suggest that Paraiso